Banded Dotterel Database Management

Authors
Affiliation

Department of Ornithology, Max Planck Institute for Biological Intelligence, Seewiesen, Germany

Mihai Valcu

Department of Ornithology, Max Planck Institute for Biological Intelligence, Seewiesen, Germany

Published

February 5, 2025

Prerequisites

R packages

install Mihai’s dbo and apis packages from the mpio-be GitHub repository

remotes::install_github("mpio-be/dbo")
remotes::install_github("mpio-be/apis")

The following packages are needed for the code in this document to run and can be easily installed from CRAN by running the following code chunk:

# Set a CRAN mirror before installing packages
options(repos = c(CRAN = "https://cloud.r-project.org"))

# a vector of all the packages needed in the project
packages_required_in_project <-
  c("dbo", "DBI", "tidyverse", "hms", "sf", 
    "xml2", "DT", "openxlsx", "here", "googlesheets4", 
    "ggmap")

# of the required packages, check if some need to be installed
new.packages <- 
  packages_required_in_project[!(packages_required_in_project %in% 
                                   installed.packages()[,"Package"])]

# install all packages that are not locally available
if(length(new.packages)) install.packages(new.packages)

# load all the packages into the current R session
lapply(packages_required_in_project, require, character.only = TRUE)

set the home directory to where the project is locally based (i.e., to find the relevant datasets to import, etc.)

here::set_here()

Set up database access

In this case we want to set up our access to the FIELD_2024_BADOatNZ database hosted on scidb.mpio.orn.mpg.de.

First run the my.cnf() to access your credentials

my.cnf()

For example, your my.cnf() file could contain the following information in which you would need to enter your username (‘user’) and password:

[scidb_replica] host=134.76.24.182 user=xxx password=xxxx

[nz_fieldworker] host=localhost user=xxx password=xxx database=FIELD_2024_BADOatNZ

Once your credentials are stored you will be able to establish a connection to the entire database that you have access to. Note that if you run dbcon() outside of Seewiesen you will get the following warning message, which indicates you are using scidb_replica: Warning message: In dbcon() : Connection to “scidb” failed, using “scidb_replica”. This simply lets you know that you are using scidb_replica - you can ignore this message and proceed

con = dbcon()

To connect specifically to the FIELD_2024_BADOatNZ database use SQL syntax within the dbExecute() function:

dbExecute(con, "USE FIELD_2024_BADOatNZ")
[1] 0

Now you can query within a specific table of the FIELD_2024_BADOatNZ database. For example to access all records in the CAPTURES table:

captures_2024 <- dbq(con, "SELECT * FROM CAPTURES")

To produce and interactive table of the the data, we can use the datatable function from the DT package:

captures_2024 %>% 
  datatable(class = 'cell-border stripe', rownames = FALSE, filter = 'top')

Merging tables

We will often need to merge two or more tables from the database based on a common relational variable. For example, one might want to get all the tag deployment locations.

Example: merge CAPTURES and GPS tables

The spatial information of each observation in the CAPTUREtable is stored in the GPS table as the Join Keys “gps_id” and “gps_point”. Using dplyr syntax, we can merge these two tables based on the “gps_id” and “gps_point” columns. Check out the interactive table and confirm that the lat and lon columns have now been added:

dbq(con, "SELECT * FROM CAPTURES") %>% 
  left_join(., dbq(con, "SELECT * FROM GPS_POINTS"), by = c("gps_id", "gps_point")) %>% 
  datatable(class = 'cell-border stripe', rownames = FALSE, filter = 'top')

A useful scenario to consider is the need to make a table of all the tag deployment locations, times, and dates. To do this we would consolidate the output of the CAPTURES and GPS merge by taking the following wrangling steps:

tag_deployments_2024 <- 
  # merge the `CAPTURES` and `GPS` by "gps_id" and "gps_point"
  dbq(con, "SELECT * FROM CAPTURES") %>% 
   left_join(., dbq(con, "SELECT * FROM GPS_POINTS"), by = c("gps_id", "gps_point")) %>% 
  
  # exclude captures that has no `tag_id`
  filter(!is.na(tag_id)) %>% 
  
  # assign only one capture time (i.e., some observations have missing data for `cap_start`, `caught`, `released`)
  rowwise() %>%
  mutate(time = as.hms(max(cap_start, caught, released, na.rm = TRUE))) %>%
  
  # subset to the first capture
  group_by(ring) %>%
  arrange(date, time) %>%
  slice(1) %>%
  ungroup() %>%
  
  # clean up the leg marking columns to a) remove erroneous white space
  mutate(across(c(UL, LL, UR, LR), ~ gsub("[,[:space:]]", "", .))) %>%
  
  # b) add "FW" to the start of each string on the `UL` that has only 2 characters 
  # (i.e., FW stands for "flag-white" and should be placed in front of the 
  # 2-character string engraved on each flag) 
  mutate(UR = if_else(nchar(UR) == 2, paste0("FW", UR), UR))  %>%
  
  # c) put a "M" if both tiba are blank (i.e., in cases when it was forgotten 
  # to enter the location of the "M")
  mutate(across(c(LL, LR), ~ str_replace(., "M", ""))) %>%
  
  # d) replace empty data with an "X"
  mutate(across(c(UL, LL, UR, LR), ~ if_else(is.na(.), "X", .)))  %>%
  
  # e) create a column called "mark" which combines the color bands into a combo (e.g., BO-WR) or gives the flag code
  mutate(mark = case_when(
    nchar(LL) == 2 & nchar(LR) == 2 ~ paste0(LL, "-", LR),  # LL and LR have 2 characters
    nchar(UL) == 4 ~ UL,  # UL has 4 characters
    nchar(UR) == 4 ~ UR,  # UR has 4 characters
    TRUE ~ NA_character_  # If none of the conditions are met, assign NA
  )) %>%
  
  # f) if the mark is a flag then replace the "FW" with "flag"
  mutate(mark = if_else(
    str_starts(mark, "FW"),  # Check if the string starts with "FW"
    str_replace(mark, "^FW", "Flag-"),  # Replace "FW" with "Flag-" at the start
    mark  # Otherwise, keep the original value
  )) %>% 
  
  # set the time to local NZ time
  mutate(local_deployment_time = as.POSIXct(with_tz(ymd_hms(paste(date, time, sep = " "), tz = 'Pacific/Auckland'), 'Pacific/Auckland'))) %>% 
  
  # consolidate to the columns of interest
  select(site, local_deployment_time, field_sex, ring, mark, tag_id, tag_type, lat, lon) %>% 
  
  # sort by deployment date
  arrange(local_deployment_time)

we can see the resulting table here

tag_deployments_2024 %>% 
  datatable(class = 'cell-border stripe', rownames = FALSE, filter = 'top')

and we can export the table to excel like this (note that Excel will ignore the ‘Pacific/Auckland’ timezone specified in R and will display the local_deployment_time column in UTC)

write.xlsx(tag_deployments_2024, 
           file = here("data/bdot_tag_deployments_2024.xlsx"), 
           sheetName = "Sheet1", 
           rowNames = FALSE)

Example: make a table of all combos and flags assigned

Tony Habraken provides excellent resighting information from the Auckland region. He has recently asked me for a list of all the potential combinations and flags assigned to aid him in the field. Here are some commented steps to wrangle the data into a format for Tony.

first import the pre-2024 captures and tidy it up

caps_pre2024 <-
  # import the pre-2024 captures table of the Google Sheets database prepared
  read_sheet("https://docs.google.com/spreadsheets/d/1Tp26Z23HSXXZSoGXD4dbP3xukhrY1kWhzQrbtRHt4EY/edit?usp=sharing",
             sheet = "Captures", col_types = "c") %>%

  # subset to Banded Dotterels
  filter(species == "BADO") %>%

  # pad the date column so that there is a zero added in front of cases with
  # only 3 characters (e.g., 1st Oct is 110, but should be 0110)
  mutate(date = str_pad(date, side = "left", width = 4, pad = "0")) %>%

  # format the date column (if there are data entered)
  mutate(date = ifelse(!is.na(date),
                       paste(year, substring(date, first = 3, last = 4),
                                   substring(date, first = 1, last = 2),
                             sep = "-"), NA) %>% as.Date()) %>%

  # pad the time column so that there is a zero added in front of cases with
  # only 3 characters (e.g., 110 is 1:10AM, but should be 0110)
  mutate(time = str_pad(time, width = 4, side = "left", pad = "0")) %>%
  mutate(time = ifelse(!is.na(time),
                       paste0(substr(time, 1, 2), ":",
                              substr(time, 3, 4), ":00"), NA) %>% as_hms) %>% 
  
  # bind time and date together and set local timezone
  mutate(local_capture_time = 
           ifelse(!is.na(time), ymd_hms(paste(date, time, sep = " "), 
                                        tz = 'Pacific/Auckland'),
                  NA) %>% as.POSIXct(tz = 'Pacific/Auckland')) %>%
  
  # subset to the first capture of each metal ring and color combo
  group_by(ring, code) %>% 
  arrange(local_capture_time) %>% 
  slice(1) %>% 
  ungroup() %>% 
  
  # make age column consistent such that juveniles and classed as "P" (i.e., pulli)
  mutate(age = ifelse(age == "J", "P", age),
         sex = ifelse(age == "P", "U", sex)) %>% 
  
  mutate(tibia_color = str_extract(code, "(?<=\\|).") %>% if_else(. == "X", NA_character_, .)) %>% 
  
  # make the code column easier to interpret for Tony by putting the colors in XX-XX syntax
  mutate(mark = ifelse(tibia_color != "F" & tibia_color != "M" & !is.na(tibia_color),
                       paste(substr(code, 4, 5), substr(code, 7, 8), sep = "-"),
                       ifelse(nchar(code) == 11, 
                              paste(substr(code, 4, 5), substr(code, 10, 11), sep = "-"),
                              ifelse(nchar(code) == 12,
                                     paste("whiteflag", substr(code, 5, 6), substr(code, 11, 12), sep = "-"),
                                      ifelse(tibia_color == "F",
                                             paste("whiteflag", substr(code, 5, 6), substr(code, 12, 13), sep = "-"), 
                                             NA))))) %>% 
  
  # make a tag column that indicates if the bird ever had a tag
  mutate(tag = ifelse(!is.na(tag) & tag != "-" & tag != "0", 1, 0)) %>%
  
  # consolidate and reduce the relevent columns
  select(population, ring, mark, sex, age, local_capture_time, tag) %>%
  
  # subset to the last capture of each metal ring (i.e., to get the latest mark assigned)
  group_by(ring) %>% 
  arrange(desc(local_capture_time)) %>% 
  slice(1) %>% 
  ungroup() %>%
  
  # sort by deployment date
  arrange(local_capture_time)

Next import the 2024 captures and tidy it up

caps_2024 <-
  dbq(con, "SELECT * FROM CAPTURES") %>% 
  rename(sex = field_sex) %>%
  mutate(age = ifelse(age == "J", "P", age)) %>% 
  mutate(sex = ifelse(age == "P", "U", sex)) %>% 
  
  # rename the site codes for better interpretation
  mutate(population = ifelse(site == "KT", "Kaitorete", 
                             ifelse(site == "PB", "Porters_Beach",
                                    ifelse(site == "KP", "Kena_Peninsula",
                                           ifelse(site == "MS", "Motueka_Spit",
                                                  ifelse(site == "KK", "Kaikoura",
                                                         ifelse(site == "TS", "Tasman_Valley", 
                                                                ifelse(site == "CR", "Cass_River",
                                                                       ifelse(site == "TA", "Te_Anau",
                                                                              ifelse(site == "TP", "Tiwai_Point", 
                                                                                     ifelse(site == "CH", "Chathams", 
                                                                                            ifelse(site == "HC", "Hawkes_Bay_coast", 
                                                                                                   ifelse(site == "TO", "Tongirio", NA))))))))))))) %>% 
  
  # assign only one capture time (i.e., some observations have missing data for `cap_start`, `caught`, `released`)
  rowwise() %>%
  mutate(time = as.hms(max(cap_start, caught, released, na.rm = TRUE))) %>%
  
  # subset to the first capture
  group_by(ring) %>%
  arrange(date, time) %>%
  slice(1) %>%
  ungroup() %>%
  
  # clean up the leg marking columns to a) remove erroneous white space
  mutate(across(c(UL, LL, UR, LR), ~ gsub("[,[:space:]]", "", .))) %>%
  
  # b) add "FW" to the start of each string on the `UL` that has only 2 characters 
  # (i.e., FW stands for "flag-white" and should be placed in front of the 
  # 2-character string engraved on each flag) 
  mutate(UR = if_else(nchar(UR) == 2, paste0("FW", UR), UR))  %>%
  
  # c) put a "M" if both tiba are blank (i.e., in cases when it was forgotten 
  # to enter the location of the "M")
  mutate(across(c(LL, LR), ~ str_replace(., "M", ""))) %>%
  
  # d) replace empty data with an "X"
  mutate(across(c(UL, LL, UR, LR), ~ if_else(is.na(.), "X", .)))  %>%
  
  # e) create a column called "mark" which combines the color bands into a combo (e.g., BO-WR) or gives the flag code
  mutate(mark = case_when(
    nchar(LL) == 2 & nchar(LR) == 2 ~ paste0(LL, "-", LR),  # LL and LR have 2 characters
    nchar(UL) == 4 ~ UL,  # UL has 4 characters
    nchar(UR) == 4 ~ UR,  # UR has 4 characters
    TRUE ~ NA_character_  # If none of the conditions are met, assign NA
  )) %>%
  
  # f) if the mark is a flag then replace the "FW" with "flag"
  mutate(mark = if_else(
    str_starts(mark, "FW"),  # Check if the string starts with "FW"
    str_replace(mark, "^FW", "whiteflag("),  # Replace "FW" with "whiteflag(" at the start
    mark  # Otherwise, keep the original value
  )) %>% 
  
  mutate(mark = if_else(
    str_starts(mark, "whiteflag"),
    paste0(mark, ")"), 
    mark  # Otherwise, keep the original value
  )) %>% 
    
  mutate(mark = ifelse(age == "P" & nchar(LR) == 1 & LR != "X", paste0("XX-", LR, "X"), mark)) %>% 
  mutate(mark = ifelse(age == "P" & nchar(LL) == 1 & LL != "X", paste0(LL, "X", "-XX"), mark)) %>% 
  mutate(mark = ifelse(str_detect(UR, "FO"), paste0("orangeflag-", mark), mark)) %>%
  
  # set the time to local NZ time
  mutate(local_capture_time = as.POSIXct(with_tz(ymd_hms(paste(date, time, sep = " "), tz = 'Pacific/Auckland'), 'Pacific/Auckland'))) %>% 
  
  # make a tag column that indicates if the bird ever had a tag
  mutate(tag = ifelse(!is.na(tag_id) & tag_id != "-" & tag_id != "0", 1, 0)) %>% 
  
  # # consolidate and reduce the relevent columns
  select(population, ring, mark, sex, age, local_capture_time, tag) %>%
  
  # sort by deployment date
  arrange(local_capture_time)

bind the two tables together and remove multiple observations of the same individual

# bind the pre-2024 and the 2024 caps together
all_banded_bdots <- 
  bind_rows(caps_pre2024, caps_2024) %>% 
  # subset to the latest capture
  group_by(ring) %>%
  arrange(desc(local_capture_time)) %>%
  slice(1) %>%
  ungroup()

export the table as an Excel sheet

write.xlsx(all_banded_bdots %>% 
             rename(UTC_capture_time = local_capture_time) %>% 
             filter(!is.na(mark)) %>% arrange(UTC_capture_time), 
           file = here("data/marked_bdots_Feb_2025.xlsx"), 
           sheetName = "Sheet1", 
           rowNames = FALSE)

out of curiosity, check the number of birds with unique markings

all_banded_bdots %>% 
  filter(str_detect(mark, "XX", negate = TRUE)) %>% nrow()
[1] 425

Manually entering a resighting record

When resightings come into via email, social media posts, or FALCON, it is important to add them to the database. For these observations we will use the RESIGHTINGS_PUBLIC table. To access the database editor, go to http://behavioural-ecology.orn.mpg.de/db_ui/field_db.php, you should see the following screen where you can enter the credentials for the Banded Dotterel database (username: bdot, password: XXX).

Next select the FIELD_2024_BADOatNZ database

Next select the RESIGHTINGS_PUBLIC table

Next click New item

Now you can enter the resighting record, such as this one posted in the “New Zealand Bird Identification” Facebook group:

The location that these two birds were seen was “Motueka Estuary”. We can use the geocode() function in the ggmap package to get the latitude and longitude for this location

location <- "Motueka Estuary, New Zealand"
geo_result <- 
  geocode(location, output = "latlona", source = "google") %>% 
  mutate(lon = round(lon, 2),
         lat = round(lat, 2))
as.data.frame(geo_result)
     lon    lat                                    address
1 173.02 -41.13 motueka estuary, motueka 7120, new zealand

Now you enter all the data you can glean from the Facebook post, and click “Save” (or click “Save and insert next” if you have another immediate resighting to enter).

Note that I’ve downloaded the original photo from Facebook, renamed it as “2025-02-04_MA_Motueka.jpg”, and put it in the following OwnCloud directory: https://owncloud.gwdg.de/index.php/s/nc0v301s0OUyVMR This is where any photos from a resighting can be stored.

Note also that the text inserted in the comments_db is: “from New Zealand Bird Identification Facebook group; reported with FWM2”. I’ve specified here the exact source within Facebook and I’ve also made note that another bird was seen with it. comments_db is essentially a dedicated space for the database manager to make a relevent comment about the resighting.

Editing a resighting observation

If you need to correct an observation, you can click “Select data” in the Table: RESIGHTINGS_PUBLIC page

then click “edit” on the observation you’d like to modify.